The goals / steps of this project are the following:
References and Attributions:
Answer: I experimented extensively with HOG parameters to find the combination that gave the best results when extracting HOG features.
The parameters I experimented with included color space, orientations, pixels_per_cell, and cells_per_block. I experimented with color spaces because I noticed that certain color spaces detected white cars better. I increased the number of orientations to provide more granularity, and I decreased the number of cells per block because larger values produced looser bounding boxes.
I selected the HOG parameters for the processing pipeline based first on the accuracy of the SVM classifier in correctly detecting cars, and second on reducing the number of false positives. Another key consideration was how much each parameter contributed to tightening the bounding box around a detected vehicle.
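The selection criterion described above (score each HOG configuration by SVC test accuracy) can be sketched as follows. The helper names and the synthetic patches in the usage example are illustrative, not the project's exact code:

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

def hog_feature(gray_patch, orient, pix_per_cell, cell_per_block):
    # HOG feature vector for a single-channel 64x64 training patch
    return hog(gray_patch, orientations=orient,
               pixels_per_cell=(pix_per_cell, pix_per_cell),
               cells_per_block=(cell_per_block, cell_per_block),
               feature_vector=True)

def score_params(car_patches, notcar_patches, orient, ppc, cpb):
    # Build the feature matrix and labels for one parameter combination
    X = np.array([hog_feature(p, orient, ppc, cpb)
                  for p in car_patches + notcar_patches])
    y = np.hstack((np.ones(len(car_patches)), np.zeros(len(notcar_patches))))
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)
    # Rank this combination by SVC accuracy on the held-out split
    svc = LinearSVC()
    svc.fit(X_train, y_train)
    return svc.score(X_test, y_test)
```

In practice the returned accuracies for each candidate grid point are compared, with box tightness and false positives checked visually on the test images.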
Answer:
First, I read in the training data: 8792 car images and 8968 not-car images. Then I used extract_features to extract HOG features into car_features and notcar_features.
Here is an example of one of each of the vehicle and non-vehicle classes:

I then explored different color spaces and different skimage.hog() parameters (orientations, pixels_per_cell, and cells_per_block). I grabbed random images from each of the two classes and displayed them to get a feel for what the skimage.hog() output looks like.
Here is an example using the YCrCb color space and HOG parameters of orientations=8, pixels_per_cell=(8, 8) and cells_per_block=(2, 2):

After I extracted the HOG features from the training data, I used those features to train a Support Vector Classifier (SVC). Before I fed the features to the SVC, I scaled the features using StandardScaler and I split the data into randomized training and test sets.
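The scaling and splitting steps described above can be sketched as follows; the feature arrays here are synthetic stand-ins for the real extracted features:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the real car / not-car feature lists
car_features = np.random.rand(100, 50)
notcar_features = np.random.rand(100, 50)

# Stack into one design matrix and build the matching label vector
X = np.vstack((car_features, notcar_features)).astype(np.float64)
y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))

# Fit a per-column scaler on X, then apply it
X_scaler = StandardScaler().fit(X)
scaled_X = X_scaler.transform(X)

# Randomized 80/20 train/test split
X_train, X_test, y_train, y_test = train_test_split(
    scaled_X, y, test_size=0.2, random_state=42)
```

Fitting the scaler before the split (as above, and as in the code below) is a simplification; fitting it on the training portion only would avoid any leakage into the test score.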
Using the RGB color space, the classifier tended to find dark cars but miss white ones. RGB also tended to produce false positives, placing bounding boxes around dark patches where trees can look like cars. With the RGB color space I got an SVC test accuracy of 0.9068.
After experimenting with a number of different parameters, I settled on the following. The color space that yielded the best result was YCrCb; with it I got an SVC test accuracy of 0.9894. I searched on two scales using YCrCb 3-channel HOG features plus spatially binned color and histograms of color in the feature vector. Here are some example images:

# Imports
from skimage.feature import hog
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
# for scikit-learn >= 0.18 use:
from sklearn.model_selection import train_test_split
# from sklearn.cross_validation import train_test_split
from scipy.ndimage.measurements import label
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
from moviepy.editor import VideoFileClip
from IPython.display import HTML
import numpy as np
import pickle
import cv2
import glob
import time
%matplotlib inline
# Explore Car_NotCar dataset
car_images = glob.glob('training_dataset/vehicles/**/*.png')
noncar_images = glob.glob('training_dataset/non-vehicles/**/*.png')
print(len(car_images), len(noncar_images))
fig, axs = plt.subplots(8,8, figsize=(16, 16))
fig.subplots_adjust(hspace = .2, wspace=.001)
axs = axs.ravel()
# Step through the lists and plot random car / not-car examples
for i in np.arange(32):
img = cv2.imread(car_images[np.random.randint(0,len(car_images))])
img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
axs[i].axis('off')
axs[i].set_title('car', fontsize=10)
axs[i].imshow(img)
for i in np.arange(32,64):
img = cv2.imread(noncar_images[np.random.randint(0,len(noncar_images))])
img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
axs[i].axis('off')
axs[i].set_title('nope', fontsize=10)
axs[i].imshow(img)
image = mpimg.imread('examples/bbox-example-image.jpg')
plt.imshow(image)
# Define a function that takes an image, a list of bounding boxes,
# and optional color tuple and line thickness as inputs
# then draws boxes in that color on the output
def draw_boxes(img, bboxes, color=(0, 0, 255), thick=6):
# Make a copy of the image
draw_img = np.copy(img)
# Iterate through the bounding boxes
for bbox in bboxes:
# Draw a rectangle given bbox coordinates
cv2.rectangle(draw_img, bbox[0], bbox[1], color, thick)
# Return the image copy with boxes drawn
return draw_img
# Here are the bounding boxes I used
bboxes = [((275, 572), (380, 510)), ((488, 563), (549, 518)), ((554, 543), (582, 522)),
((601, 555), (646, 522)), ((657, 545), (685, 517)), ((849, 678), (1135, 512))]
result = draw_boxes(image, bboxes)
plt.imshow(result)
# Template Matching on Example Image
image = mpimg.imread('examples/bbox-example-image.jpg')
plt.imshow(image)
# Define a function to search for template matches
# and return a list of bounding boxes
image = mpimg.imread('examples/bbox-example-image.jpg')
#image = mpimg.imread('temp-matching-example-2.jpg')
templist = ['cutouts/cutout1.jpg', 'cutouts/cutout2.jpg', 'cutouts/cutout3.jpg',
'cutouts/cutout4.jpg', 'cutouts/cutout5.jpg', 'cutouts/cutout6.jpg']
def find_matches(img, template_list):
# Define an empty list to take bbox coords
bbox_list = []
# Define matching method
# Other options include: cv2.TM_CCORR_NORMED', 'cv2.TM_CCOEFF', 'cv2.TM_CCORR',
# 'cv2.TM_SQDIFF', 'cv2.TM_SQDIFF_NORMED'
method = cv2.TM_CCOEFF_NORMED
# Iterate through template list
for temp in template_list:
# Read in templates one by one
tmp = mpimg.imread(temp)
# Use cv2.matchTemplate() to search the image
result = cv2.matchTemplate(img, tmp, method)
# Use cv2.minMaxLoc() to extract the location of the best match
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)
# Determine a bounding box for the match
w, h = (tmp.shape[1], tmp.shape[0])
if method in [cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED]:
top_left = min_loc
else:
top_left = max_loc
bottom_right = (top_left[0] + w, top_left[1] + h)
# Append bbox position to list
bbox_list.append((top_left, bottom_right))
# Return the list of bounding boxes
return bbox_list
bboxes = find_matches(image, templist)
result = draw_boxes(image, bboxes)
plt.imshow(result)
# Histogram of Colors
import numpy as np
import cv2
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
image = mpimg.imread('cutouts/cutout1.jpg')
# Define a function to compute color histogram features
def color_hist(img, nbins=32, bins_range=(0, 256)):
# Compute the histogram of the RGB channels separately
rhist = np.histogram(img[:,:,0], bins=nbins, range=bins_range)
ghist = np.histogram(img[:,:,1], bins=nbins, range=bins_range)
bhist = np.histogram(img[:,:,2], bins=nbins, range=bins_range)
# Generating bin centers
bin_edges = rhist[1]
bin_centers = (bin_edges[1:] + bin_edges[0:len(bin_edges)-1])/2
# Concatenate the histograms into a single feature vector
hist_features = np.concatenate((rhist[0], ghist[0], bhist[0]))
# Return the individual histograms, bin_centers and feature vector
return rhist, ghist, bhist, bin_centers, hist_features
rh, gh, bh, bincen, feature_vec = color_hist(image, nbins=32, bins_range=(0, 256))
# Plot a figure with all three bar charts
if rh is not None:
fig = plt.figure(figsize=(12,3))
plt.subplot(131)
plt.bar(bincen, rh[0])
plt.xlim(0, 256)
plt.title('R Histogram')
plt.subplot(132)
plt.bar(bincen, gh[0])
plt.xlim(0, 256)
plt.title('G Histogram')
plt.subplot(133)
plt.bar(bincen, bh[0])
plt.xlim(0, 256)
plt.title('B Histogram')
fig.tight_layout()
else:
print('Your function is returning None for at least one variable...')
# Spatial Binning of Color
# Read in an image
# You can also read cutout2, 3, 4 etc. to see other examples
image = mpimg.imread('cutouts/cutout1.jpg')
# Define a function to compute color histogram features
# Pass the color_space flag as 3-letter all caps string
# like 'HSV' or 'LUV' etc.
# KEEP IN MIND IF YOU DECIDE TO USE THIS FUNCTION LATER
# IN YOUR PROJECT THAT IF YOU READ THE IMAGE WITH
# cv2.imread() INSTEAD YOU START WITH BGR COLOR!
def bin_spatial(img, color_space='RGB', size=(32, 32)):
    # Convert image to new color space (if specified)
    if color_space == 'HSV':
        feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)
    elif color_space == 'LUV':
        feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2LUV)
    else:
        feature_image = np.copy(img)
    # Use cv2.resize().ravel() to create the feature vector
    features = cv2.resize(feature_image, size).ravel()
    # Return the feature vector
    return features
feature_vec = bin_spatial(image, color_space='RGB', size=(32, 32))
# Plot features
plt.plot(feature_vec)
plt.title('Spatially Binned Features')
# Data Exploration of Small Car_NotCar image set
#images = glob.glob('*.jpeg')
#images = glob.glob('./test_images_small/test1.jpg')
images = glob.glob('test_images_small/*/*.jpg')
print ("number of images=",len(images))
car_images = glob.glob('training_dataset/vehicles/**/*.png')
noncar_images = glob.glob('training_dataset/non-vehicles/**/*.png')
print(len(car_images), len(noncar_images))
cars = []
notcars = []
for image in images:
print ("Test Car_NotCar images", image)
# mpimg.imread(image)
if 'cars' in image:
cars.append(image)
elif 'image' in image or 'extra' in image:
notcars.append(image)
else:
cars.append(image)
print("number of cars,notcars=", len(cars), len(notcars))
# Define data_look function
# Define a function to return some characteristics of the dataset
def data_look(car_list, notcar_list):
data_dict = {}
# Define a key in data_dict "n_cars" and store the number of car images
data_dict["n_cars"] = len(car_list)
# Define a key "n_notcars" and store the number of notcar images
data_dict["n_notcars"] = len(notcar_list)
# Read in a test image, either car or notcar
# example_img = mpimg.imread(car_list[0])
example_img = mpimg.imread(car_list[0])
# Define a key "image_shape" and store the test image shape 3-tuple
data_dict["image_shape"] = example_img.shape
# Define a key "data_type" and store the data type of the test image.
data_dict["data_type"] = example_img.dtype
# Return data_dict
return data_dict
data_info = data_look(cars, notcars)
print('Your function returned a count of',
data_info["n_cars"], ' cars and',
data_info["n_notcars"], ' non-cars')
print('of size: ',data_info["image_shape"], ' and data type:',
data_info["data_type"])
# Just for fun choose random car / not-car indices and plot example images
car_ind = np.random.randint(0, len(cars))
notcar_ind = np.random.randint(0, len(notcars))
#car_ind = 0
#notcar_ind = 0
# Read in car / not-car images
car_image = mpimg.imread(cars[car_ind])
notcar_image = mpimg.imread(notcars[notcar_ind])
# Plot the examples
#fig = plt.figure(figsize=(10,10))
fig = plt.figure(figsize=(20,20))
plt.subplot(121)
plt.imshow(car_image)
plt.title('Example Car Image')
plt.subplot(122)
plt.imshow(notcar_image)
plt.title('Example Not-car Image')
plt.show()
# Visualize HOG features
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import cv2
import glob
from skimage.feature import hog
# Read in our vehicles and non-vehicles
images = glob.glob('./test_images_small/*/*.jpg')
cars = []
notcars = []
for image in images:
if 'cars' in image:
cars.append(image)
elif 'image' in image or 'extra' in image:
notcars.append(image)
else:
cars.append(image)
print("number of cars,notcars=", len(cars), len(notcars))
# Define a function to return HOG features and visualization
# NOTE: skimage >= 0.19 renames the `visualise` keyword to `visualize`
def get_hog_features(img, orient, pix_per_cell, cell_per_block, vis=False, feature_vec=True):
if vis == True:
features, hog_image = hog(img, orientations=orient, pixels_per_cell=(pix_per_cell, pix_per_cell),
cells_per_block=(cell_per_block, cell_per_block), transform_sqrt=False,
visualise=True, feature_vector=False)
return features, hog_image
else:
features = hog(img, orientations=orient, pixels_per_cell=(pix_per_cell, pix_per_cell),
cells_per_block=(cell_per_block, cell_per_block), transform_sqrt=False,
visualise=False, feature_vector=feature_vec)
return features
# Generate a random index to look at a car image
ind = np.random.randint(0, len(cars))
# Read in the image
image = mpimg.imread(cars[ind])
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
# Define HOG parameters
orient = 9
pix_per_cell = 8
cell_per_block = 2
# Call our function with vis=True to see an image output
features, hog_image = get_hog_features(gray, orient,
pix_per_cell, cell_per_block,
vis=True, feature_vec=False)
# Plot the examples
fig = plt.figure(figsize=(30,30))
#plt.subplot(121)
plt.subplot(211)
plt.imshow(image, cmap='gray')
plt.title('Example Car Image')
#plt.subplot(122)
plt.subplot(212)
plt.imshow(hog_image, cmap='gray')
plt.title('HOG Visualization')
plt.show()
# Sliding Window
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import cv2
import glob
import time
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
# NOTE: for scikit-learn <= 0.17 use instead:
# from sklearn.cross_validation import train_test_split
from sklearn.model_selection import train_test_split
# Define a function to compute binned color features
def bin_spatial(img, size=(32, 32)):
# Use cv2.resize().ravel() to create the feature vector
features = cv2.resize(img, size).ravel()
# Return the feature vector
return features
# Define a function to compute color histogram features
def color_hist(img, nbins=32, bins_range=(0, 256)):
# Compute the histogram of the color channels separately
channel1_hist = np.histogram(img[:,:,0], bins=nbins, range=bins_range)
channel2_hist = np.histogram(img[:,:,1], bins=nbins, range=bins_range)
channel3_hist = np.histogram(img[:,:,2], bins=nbins, range=bins_range)
# Concatenate the histograms into a single feature vector
hist_features = np.concatenate((channel1_hist[0], channel2_hist[0], channel3_hist[0]))
# Return the individual histograms, bin_centers and feature vector
return hist_features
# Define a function to extract features from a list of images
# Have this function call bin_spatial() and color_hist()
def extract_features(imgs, cspace='RGB', spatial_size=(32, 32),
hist_bins=32, hist_range=(0, 256)):
# Create a list to append feature vectors to
features = []
# Iterate through the list of images
for file in imgs:
# Read in each one by one
image = mpimg.imread(file)
# apply color conversion if other than 'RGB'
if cspace != 'RGB':
if cspace == 'HSV':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
elif cspace == 'LUV':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2LUV)
elif cspace == 'HLS':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HLS)
elif cspace == 'YUV':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2YUV)
else: feature_image = np.copy(image)
# Apply bin_spatial() to get spatial color features
spatial_features = bin_spatial(feature_image, size=spatial_size)
# Apply color_hist() also with a color space option now
hist_features = color_hist(feature_image, nbins=hist_bins, bins_range=hist_range)
# Append the new feature vector to the features list
features.append(np.concatenate((spatial_features, hist_features)))
# Return list of feature vectors
return features
# Read in car and non-car images
#images = glob.glob('test_images_small/*/*.jpg')
images = glob.glob('training_dataset/**/**/*.png')
cars = []
notcars = []
for image in images:
if 'non-vehicles' in image:
notcars.append(image)
elif 'vehicles' in image:
cars.append(image)
else:
cars.append(image)
print("number of cars,notcars=", len(cars), len(notcars))
# TODO play with these values to see how your classifier
# performs under different binning scenarios
spatial = 32
histbin = 32
car_features = extract_features(cars, cspace='RGB', spatial_size=(spatial, spatial),
hist_bins=histbin, hist_range=(0, 256))
notcar_features = extract_features(notcars, cspace='RGB', spatial_size=(spatial, spatial),
hist_bins=histbin, hist_range=(0, 256))
# Create an array stack of feature vectors
X = np.vstack((car_features, notcar_features)).astype(np.float64)
# Fit a per-column scaler
X_scaler = StandardScaler().fit(X)
# Apply the scaler to X
scaled_X = X_scaler.transform(X)
# Define the labels vector
y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))
# Split up data into randomized training and test sets
rand_state = np.random.randint(0, 100)
X_train, X_test, y_train, y_test = train_test_split(
scaled_X, y, test_size=0.2, random_state=rand_state)
print('Using spatial binning of:',spatial,
'and', histbin,'histogram bins')
print('Feature vector length:', len(X_train[0]))
# Use a linear SVC
svc = LinearSVC()
# Check the training time for the SVC
t=time.time()
svc.fit(X_train, y_train)
t2 = time.time()
print(round(t2-t, 2), 'Seconds to train SVC...')
# Check the score of the SVC
print('Test Accuracy of SVC = ', round(svc.score(X_test, y_test), 4))
# Check the prediction time for a single sample
t=time.time()
n_predict = 10
print('My SVC predicts: ', svc.predict(X_test[0:n_predict]))
print('For these',n_predict, 'labels: ', y_test[0:n_predict])
t2 = time.time()
print(round(t2-t, 5), 'Seconds to predict', n_predict,'labels with SVC')
# Combine and Normalize Features
import numpy as np
import cv2
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from sklearn.preprocessing import StandardScaler
import glob
# Define a function to compute binned color features
def bin_spatial(img, size=(32, 32)):
# Use cv2.resize().ravel() to create the feature vector
features = cv2.resize(img, size).ravel()
# Return the feature vector
return features
# Define a function to compute color histogram features
def color_hist(img, nbins=32, bins_range=(0, 256)):
# Compute the histogram of the color channels separately
channel1_hist = np.histogram(img[:,:,0], bins=nbins, range=bins_range)
channel2_hist = np.histogram(img[:,:,1], bins=nbins, range=bins_range)
channel3_hist = np.histogram(img[:,:,2], bins=nbins, range=bins_range)
# Concatenate the histograms into a single feature vector
hist_features = np.concatenate((channel1_hist[0], channel2_hist[0], channel3_hist[0]))
# Return the individual histograms, bin_centers and feature vector
return hist_features
###### Extract Features ###########
# Define a function to extract features from a list of images
# Have this function call bin_spatial() and color_hist()
def extract_features(imgs, cspace='RGB', spatial_size=(32, 32),
hist_bins=32, hist_range=(0, 256)):
# Create a list to append feature vectors to
features = []
# Iterate through the list of images
for file in imgs:
# Read in each one by one
image = mpimg.imread(file)
# apply color conversion if other than 'RGB'
if cspace != 'RGB':
if cspace == 'HSV':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
elif cspace == 'LUV':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2LUV)
elif cspace == 'HLS':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HLS)
elif cspace == 'YUV':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2YUV)
else: feature_image = np.copy(image)
# Apply bin_spatial() to get spatial color features
spatial_features = bin_spatial(feature_image, size=spatial_size)
# Apply color_hist() also with a color space option now
hist_features = color_hist(feature_image, nbins=hist_bins, bins_range=hist_range)
# Append the new feature vector to the features list
features.append(np.concatenate((spatial_features, hist_features)))
# Return list of feature vectors
return features
images = glob.glob('test_images_small/*/*.jpg')
cars = []
notcars = []
for image in images:
if 'cars' in image:
cars.append(image)
elif 'image' in image or 'extra' in image:
notcars.append(image)
else:
cars.append(image)
car_features = extract_features(cars, cspace='RGB', spatial_size=(32, 32),
hist_bins=32, hist_range=(0, 256))
notcar_features = extract_features(notcars, cspace='RGB', spatial_size=(32, 32),
hist_bins=32, hist_range=(0, 256))
if len(car_features) > 0:
# Create an array stack of feature vectors
X = np.vstack((car_features, notcar_features)).astype(np.float64)
# Fit a per-column scaler
X_scaler = StandardScaler().fit(X)
# Apply the scaler to X
scaled_X = X_scaler.transform(X)
car_ind = np.random.randint(0, len(cars))
# Plot an example of raw and scaled features
fig = plt.figure(figsize=(12,4))
plt.subplot(131)
plt.imshow(mpimg.imread(cars[car_ind]))
plt.title('Original Image')
plt.subplot(132)
plt.plot(X[car_ind])
plt.title('Raw Features')
plt.subplot(133)
plt.plot(scaled_X[car_ind])
plt.title('Normalized Features')
fig.tight_layout()
else:
print('Your function only returns empty feature vectors...')
# Color Classify
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import cv2
import glob
import time
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
# NOTE: for scikit-learn <= 0.17 use instead:
# from sklearn.cross_validation import train_test_split
from sklearn.model_selection import train_test_split
# Define a function to compute binned color features
def bin_spatial(img, size=(32, 32)):
# Use cv2.resize().ravel() to create the feature vector
features = cv2.resize(img, size).ravel()
# Return the feature vector
return features
# Define a function to compute color histogram features
def color_hist(img, nbins=32, bins_range=(0, 256)):
# Compute the histogram of the color channels separately
channel1_hist = np.histogram(img[:,:,0], bins=nbins, range=bins_range)
channel2_hist = np.histogram(img[:,:,1], bins=nbins, range=bins_range)
channel3_hist = np.histogram(img[:,:,2], bins=nbins, range=bins_range)
# Concatenate the histograms into a single feature vector
hist_features = np.concatenate((channel1_hist[0], channel2_hist[0], channel3_hist[0]))
# Return the individual histograms, bin_centers and feature vector
return hist_features
# Define a function to extract features from a list of images
# Have this function call bin_spatial() and color_hist()
def extract_features(imgs, cspace='RGB', spatial_size=(32, 32),
hist_bins=32, hist_range=(0, 256)):
# Create a list to append feature vectors to
features = []
# Iterate through the list of images
for file in imgs:
# Read in each one by one
image = mpimg.imread(file)
# apply color conversion if other than 'RGB'
if cspace != 'RGB':
if cspace == 'HSV':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
elif cspace == 'LUV':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2LUV)
elif cspace == 'HLS':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HLS)
elif cspace == 'YUV':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2YUV)
else: feature_image = np.copy(image)
# Apply bin_spatial() to get spatial color features
spatial_features = bin_spatial(feature_image, size=spatial_size)
# Apply color_hist() also with a color space option now
hist_features = color_hist(feature_image, nbins=hist_bins, bins_range=hist_range)
# Append the new feature vector to the features list
features.append(np.concatenate((spatial_features, hist_features)))
# Return list of feature vectors
return features
# Read in car and non-car images
#images = glob.glob('test_images_small/*/*.jpg')
images = glob.glob('training_dataset/**/**/*.png')
cars = []
notcars = []
for image in images:
if 'non-vehicles' in image:
notcars.append(image)
elif 'vehicles' in image:
cars.append(image)
else:
cars.append(image)
print("number of cars,notcars=", len(cars), len(notcars))
# TODO play with these values to see how your classifier
# performs under different binning scenarios
spatial = 32
histbin = 32
car_features = extract_features(cars, cspace='RGB', spatial_size=(spatial, spatial),
hist_bins=histbin, hist_range=(0, 256))
notcar_features = extract_features(notcars, cspace='RGB', spatial_size=(spatial, spatial),
hist_bins=histbin, hist_range=(0, 256))
# Create an array stack of feature vectors
X = np.vstack((car_features, notcar_features)).astype(np.float64)
# Fit a per-column scaler
X_scaler = StandardScaler().fit(X)
# Apply the scaler to X
scaled_X = X_scaler.transform(X)
# Define the labels vector
y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))
# Split up data into randomized training and test sets
rand_state = np.random.randint(0, 100)
X_train, X_test, y_train, y_test = train_test_split(
scaled_X, y, test_size=0.2, random_state=rand_state)
print('Using spatial binning of:',spatial,
'and', histbin,'histogram bins')
print('Feature vector length:', len(X_train[0]))
# Use a linear SVC
svc = LinearSVC()
# Check the training time for the SVC
t=time.time()
svc.fit(X_train, y_train)
t2 = time.time()
print(round(t2-t, 2), 'Seconds to train SVC...')
# Check the score of the SVC
print('Test Accuracy of SVC = ', round(svc.score(X_test, y_test), 4))
# Check the prediction time for a single sample
t=time.time()
n_predict = 10
print('My SVC predicts: ', svc.predict(X_test[0:n_predict]))
print('For these',n_predict, 'labels: ', y_test[0:n_predict])
t2 = time.time()
print(round(t2-t, 5), 'Seconds to predict', n_predict,'labels with SVC')
# Load Training Data
car_images = glob.glob('training_dataset/vehicles/**/*.png')
noncar_images = glob.glob('training_dataset/non-vehicles/**/*.png')
print(len(car_images), len(noncar_images))
# HOG Classify
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import cv2
import glob
import time
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from skimage.feature import hog
# NOTE: for scikit-learn <= 0.17 use instead:
# from sklearn.cross_validation import train_test_split
from sklearn.model_selection import train_test_split
# Define a function to return HOG features and visualization
def get_hog_features(img, orient, pix_per_cell, cell_per_block,
vis=False, feature_vec=True):
# Call with two outputs if vis==True
if vis == True:
features, hog_image = hog(img, orientations=orient, pixels_per_cell=(pix_per_cell, pix_per_cell),
cells_per_block=(cell_per_block, cell_per_block), transform_sqrt=True,
visualise=vis, feature_vector=feature_vec)
return features, hog_image
# Otherwise call with one output
else:
features = hog(img, orientations=orient, pixels_per_cell=(pix_per_cell, pix_per_cell),
cells_per_block=(cell_per_block, cell_per_block), transform_sqrt=True,
visualise=vis, feature_vector=feature_vec)
return features
# Define a function to extract features from a list of images
# Have this function call bin_spatial() and color_hist()
def extract_features(imgs, cspace='RGB', orient=9,
pix_per_cell=8, cell_per_block=2, hog_channel=0):
# Create a list to append feature vectors to
features = []
# Iterate through the list of images
for file in imgs:
# Read in each one by one
image = mpimg.imread(file)
# apply color conversion if other than 'RGB'
if cspace != 'RGB':
if cspace == 'HSV':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
elif cspace == 'LUV':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2LUV)
elif cspace == 'HLS':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HLS)
elif cspace == 'YUV':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2YUV)
elif cspace == 'YCrCb':
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2YCrCb)
else: feature_image = np.copy(image)
# Call get_hog_features() with vis=False, feature_vec=True
if hog_channel == 'ALL':
hog_features = []
for channel in range(feature_image.shape[2]):
hog_features.append(get_hog_features(feature_image[:,:,channel],
orient, pix_per_cell, cell_per_block,
vis=False, feature_vec=True))
hog_features = np.ravel(hog_features)
else:
hog_features = get_hog_features(feature_image[:,:,hog_channel], orient,
pix_per_cell, cell_per_block, vis=False, feature_vec=True)
# Append the new feature vector to the features list
features.append(hog_features)
# Return list of feature vectors
return features
# Divide up into cars and notcars
images = glob.glob('training_dataset/**/**/*.png')
cars = []
notcars = []
for image in images:
if 'non-vehicles' in image:
notcars.append(image)
elif 'vehicles' in image:
cars.append(image)
else:
cars.append(image)
print("number of cars,notcars=", len(cars), len(notcars))
# Reduce the sample size because HOG features are slow to compute
# The quiz evaluator times out after 13s of CPU time
sample_size = 500
cars = cars[0:sample_size]
notcars = notcars[0:sample_size]
### TODO: Tweak these parameters and see how the results change.
colorspace = 'RGB' # Can be RGB, HSV, LUV, HLS, YUV, YCrCb
orient = 9
pix_per_cell = 8
cell_per_block = 2
hog_channel = 0 # Can be 0, 1, 2, or "ALL"
t=time.time()
car_features = extract_features(cars, cspace=colorspace, orient=orient,
pix_per_cell=pix_per_cell, cell_per_block=cell_per_block,
hog_channel=hog_channel)
notcar_features = extract_features(notcars, cspace=colorspace, orient=orient,
pix_per_cell=pix_per_cell, cell_per_block=cell_per_block,
hog_channel=hog_channel)
t2 = time.time()
print(round(t2-t, 2), 'Seconds to extract HOG features...')
# Create an array stack of feature vectors
X = np.vstack((car_features, notcar_features)).astype(np.float64)
# Fit a per-column scaler
X_scaler = StandardScaler().fit(X)
# Apply the scaler to X
scaled_X = X_scaler.transform(X)
# Define the labels vector
y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))
# Split up data into randomized training and test sets
rand_state = np.random.randint(0, 100)
X_train, X_test, y_train, y_test = train_test_split(
scaled_X, y, test_size=0.2, random_state=rand_state)
print('Using:',orient,'orientations',pix_per_cell,
'pixels per cell and', cell_per_block,'cells per block')
print('Feature vector length:', len(X_train[0]))
# Use a linear SVC
svc = LinearSVC()
# Check the training time for the SVC
t=time.time()
svc.fit(X_train, y_train)
t2 = time.time()
print(round(t2-t, 2), 'Seconds to train SVC...')
# Check the score of the SVC
print('Test Accuracy of SVC = ', round(svc.score(X_test, y_test), 4))
# Check the prediction time for a single sample
t=time.time()
n_predict = 10
print('My SVC predicts: ', svc.predict(X_test[0:n_predict]))
print('For these',n_predict, 'labels: ', y_test[0:n_predict])
t2 = time.time()
print(round(t2-t, 5), 'Seconds to predict', n_predict,'labels with SVC')
import matplotlib.image as mpimg
import numpy as np
import cv2
from skimage.feature import hog
# Define a function to return HOG features and visualization
def get_hog_features(img, orient, pix_per_cell, cell_per_block,
vis=False, feature_vec=True):
# Call with two outputs if vis==True
if vis == True:
features, hog_image = hog(img, orientations=orient,
pixels_per_cell=(pix_per_cell, pix_per_cell),
cells_per_block=(cell_per_block, cell_per_block),
transform_sqrt=True,
visualise=vis, feature_vector=feature_vec)
return features, hog_image
# Otherwise call with one output
else:
features = hog(img, orientations=orient,
pixels_per_cell=(pix_per_cell, pix_per_cell),
cells_per_block=(cell_per_block, cell_per_block),
transform_sqrt=True,
visualise=vis, feature_vector=feature_vec)
return features
# Define a function to compute binned color features
def bin_spatial(img, size=(32, 32)):
# Use cv2.resize().ravel() to create the feature vector
features = cv2.resize(img, size).ravel()
# Return the feature vector
return features
# Define a function to compute color histogram features
# NEED TO CHANGE bins_range if reading .png files with mpimg!
def color_hist(img, nbins=32, bins_range=(0, 256)):
    # Compute the histogram of the color channels separately
    channel1_hist = np.histogram(img[:,:,0], bins=nbins, range=bins_range)
    channel2_hist = np.histogram(img[:,:,1], bins=nbins, range=bins_range)
    channel3_hist = np.histogram(img[:,:,2], bins=nbins, range=bins_range)
    # Concatenate the histograms into a single feature vector
    hist_features = np.concatenate((channel1_hist[0], channel2_hist[0], channel3_hist[0]))
    # Return the concatenated feature vector
    return hist_features
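The histogram feature length is simply channels times bins. A small numpy-only sketch on a synthetic random image (the image contents and seed are invented, purely to exercise `np.histogram`):

```python
import numpy as np

# Synthetic 64x64 RGB image with values in [0, 255]
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3))

nbins = 32
# One histogram per channel, same range convention as color_hist()
hists = [np.histogram(img[:, :, ch], bins=nbins, range=(0, 256))[0] for ch in range(3)]
hist_features = np.concatenate(hists)

print(hist_features.shape)  # (96,)   -> 3 channels x 32 bins
print(hist_features.sum())  # 12288   -> every one of 3*64*64 pixels lands in a bin
```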
# Define a function to extract features from a list of images
# Have this function call bin_spatial() and color_hist()
def extract_features(imgs, color_space='RGB', spatial_size=(32, 32),
                     hist_bins=32, orient=9,
                     pix_per_cell=8, cell_per_block=2, hog_channel=0,
                     spatial_feat=True, hist_feat=True, hog_feat=True):
    # Create a list to append feature vectors to
    features = []
    # Iterate through the list of images
    for file in imgs:
        file_features = []
        # Read in each one by one
        image = mpimg.imread(file)
        # apply color conversion if other than 'RGB'
        if color_space != 'RGB':
            if color_space == 'HSV':
                feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
            elif color_space == 'LUV':
                feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2LUV)
            elif color_space == 'HLS':
                feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HLS)
            elif color_space == 'YUV':
                feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2YUV)
            elif color_space == 'YCrCb':
                feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2YCrCb)
        else: feature_image = np.copy(image)
        if spatial_feat == True:
            spatial_features = bin_spatial(feature_image, size=spatial_size)
            file_features.append(spatial_features)
        if hist_feat == True:
            # Apply color_hist()
            hist_features = color_hist(feature_image, nbins=hist_bins)
            file_features.append(hist_features)
        if hog_feat == True:
            # Call get_hog_features() with vis=False, feature_vec=True
            if hog_channel == 'ALL':
                hog_features = []
                for channel in range(feature_image.shape[2]):
                    hog_features.append(get_hog_features(feature_image[:,:,channel],
                                        orient, pix_per_cell, cell_per_block,
                                        vis=False, feature_vec=True))
                hog_features = np.ravel(hog_features)
            else:
                hog_features = get_hog_features(feature_image[:,:,hog_channel], orient,
                                                pix_per_cell, cell_per_block,
                                                vis=False, feature_vec=True)
            # Append the new feature vector to the features list
            file_features.append(hog_features)
        features.append(np.concatenate(file_features))
    # Return list of feature vectors
    return features
# Define a function that takes an image,
# start and stop positions in both x and y,
# window size (x and y dimensions),
# and overlap fraction (for both x and y)
def slide_window(img, x_start_stop=[None, None], y_start_stop=[None, None],
                 xy_window=(64, 64), xy_overlap=(0.5, 0.5)):
    # If x and/or y start/stop positions not defined, set to image size
    if x_start_stop[0] == None:
        x_start_stop[0] = 0
    if x_start_stop[1] == None:
        x_start_stop[1] = img.shape[1]
    if y_start_stop[0] == None:
        y_start_stop[0] = 0
    if y_start_stop[1] == None:
        y_start_stop[1] = img.shape[0]
    # Compute the span of the region to be searched
    xspan = x_start_stop[1] - x_start_stop[0]
    yspan = y_start_stop[1] - y_start_stop[0]
    # Compute the number of pixels per step in x/y
    # (np.int was removed in NumPy 1.24; the builtin int behaves the same here)
    nx_pix_per_step = int(xy_window[0]*(1 - xy_overlap[0]))
    ny_pix_per_step = int(xy_window[1]*(1 - xy_overlap[1]))
    # Compute the number of windows in x/y
    nx_buffer = int(xy_window[0]*(xy_overlap[0]))
    ny_buffer = int(xy_window[1]*(xy_overlap[1]))
    nx_windows = int((xspan-nx_buffer)/nx_pix_per_step)
    ny_windows = int((yspan-ny_buffer)/ny_pix_per_step)
    # Initialize a list to append window positions to
    window_list = []
    # Loop through finding x and y window positions
    # Note: you could vectorize this step, but in practice
    # you'll be considering windows one by one with your
    # classifier, so looping makes sense
    for ys in range(ny_windows):
        for xs in range(nx_windows):
            # Calculate window position
            startx = xs*nx_pix_per_step + x_start_stop[0]
            endx = startx + xy_window[0]
            starty = ys*ny_pix_per_step + y_start_stop[0]
            endy = starty + xy_window[1]
            # Append window position to list
            window_list.append(((startx, starty), (endx, endy)))
    # Return the list of windows
    return window_list
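The window-count arithmetic in `slide_window()` can be previewed without an image. A sketch mirroring its computation, assuming a 1280-pixel-wide frame searched between y=400 and y=800 with 96x96 windows at 50% overlap (these numbers echo the pipeline settings used later, but are restated here purely as assumptions):

```python
# How many windows will the sliding-window search generate?
xspan, yspan = 1280, 800 - 400   # assumed search region
win, overlap = 96, 0.5           # assumed window size and overlap

step = int(win * (1 - overlap))      # 48 px advanced per step
buffer = int(win * overlap)          # 48 px kept clear at the far edge
nx = int((xspan - buffer) / step)    # 25 window positions across
ny = int((yspan - buffer) / step)    # 7 rows of windows

print(nx * ny)  # 175 windows to classify per frame
```

Each of those windows is cropped, resized to 64x64, and fed to the classifier, so the window size and overlap directly trade detection granularity against per-frame cost.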
# Define a function to draw bounding boxes
def draw_boxes(img, bboxes, color=(0, 0, 255), thick=6):
    # Make a copy of the image
    imcopy = np.copy(img)
    # Iterate through the bounding boxes
    for bbox in bboxes:
        # Draw a rectangle given bbox coordinates
        cv2.rectangle(imcopy, bbox[0], bbox[1], color, thick)
    # Return the image copy with boxes drawn
    return imcopy
# Search and Classify
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import cv2
import glob
import time
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from skimage.feature import hog
#from lesson_functions import *
# NOTE: the next import is only valid for scikit-learn version <= 0.17
# for scikit-learn >= 0.18 use:
# from sklearn.model_selection import train_test_split
from sklearn.cross_validation import train_test_split
# Define a function to extract features from a single image window
# This function is very similar to extract_features()
# just for a single image rather than list of images
def single_img_features(img, color_space='RGB', spatial_size=(32, 32),
                        hist_bins=32, orient=9,
                        pix_per_cell=8, cell_per_block=2, hog_channel=0,
                        spatial_feat=True, hist_feat=True, hog_feat=True):
    #1) Define an empty list to receive features
    img_features = []
    #2) Apply color conversion if other than 'RGB'
    if color_space != 'RGB':
        if color_space == 'HSV':
            feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)
        elif color_space == 'LUV':
            feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2LUV)
        elif color_space == 'HLS':
            feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2HLS)
        elif color_space == 'YUV':
            feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2YUV)
        elif color_space == 'YCrCb':
            feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2YCrCb)
    else: feature_image = np.copy(img)
    #3) Compute spatial features if flag is set
    if spatial_feat == True:
        spatial_features = bin_spatial(feature_image, size=spatial_size)
        #4) Append features to list
        img_features.append(spatial_features)
    #5) Compute histogram features if flag is set
    if hist_feat == True:
        hist_features = color_hist(feature_image, nbins=hist_bins)
        #6) Append features to list
        img_features.append(hist_features)
    #7) Compute HOG features if flag is set
    if hog_feat == True:
        if hog_channel == 'ALL':
            hog_features = []
            for channel in range(feature_image.shape[2]):
                hog_features.extend(get_hog_features(feature_image[:,:,channel],
                                    orient, pix_per_cell, cell_per_block,
                                    vis=False, feature_vec=True))
        else:
            hog_features = get_hog_features(feature_image[:,:,hog_channel], orient,
                                            pix_per_cell, cell_per_block,
                                            vis=False, feature_vec=True)
        #8) Append features to list
        img_features.append(hog_features)
    #9) Return concatenated array of features
    return np.concatenate(img_features)
# Define a function you will pass an image
# and the list of windows to be searched (output of slide_windows())
def search_windows(img, windows, clf, scaler, color_space='RGB',
                   spatial_size=(32, 32), hist_bins=32,
                   hist_range=(0, 256), orient=9,
                   pix_per_cell=8, cell_per_block=2,
                   hog_channel=0, spatial_feat=True,
                   hist_feat=True, hog_feat=True):
    #1) Create an empty list to receive positive detection windows
    on_windows = []
    #2) Iterate over all windows in the list
    for window in windows:
        #3) Extract the test window from original image
        test_img = cv2.resize(img[window[0][1]:window[1][1], window[0][0]:window[1][0]], (64, 64))
        #4) Extract features for that window using single_img_features()
        features = single_img_features(test_img, color_space=color_space,
                                       spatial_size=spatial_size, hist_bins=hist_bins,
                                       orient=orient, pix_per_cell=pix_per_cell,
                                       cell_per_block=cell_per_block,
                                       hog_channel=hog_channel, spatial_feat=spatial_feat,
                                       hist_feat=hist_feat, hog_feat=hog_feat)
        #5) Scale extracted features to be fed to classifier
        test_features = scaler.transform(np.array(features).reshape(1, -1))
        #6) Predict using your classifier
        prediction = clf.predict(test_features)
        #7) If positive (prediction == 1) then save the window
        if prediction == 1:
            on_windows.append(window)
    #8) Return windows for positive detections
    return on_windows
# Read in cars and notcars
images = glob.glob('training_dataset/**/**/*.png')
cars = []
notcars = []
for image in images:
    if 'non-vehicles' in image:
        notcars.append(image)
    elif 'vehicles' in image:
        cars.append(image)
    else:
        cars.append(image)
print("number of cars,notcars=", len(cars), len(notcars))
# Cap the sample size to keep feature extraction and training times manageable
sample_size = 8500
cars = cars[0:sample_size]
notcars = notcars[0:sample_size]
### Tweak these parameters and see how the results change.
color_space = 'YCrCb'      # Can be RGB, HSV, LUV, HLS, YUV, YCrCb
orient = 16                # HOG orientations (9 was also tried)
pix_per_cell = 8           # HOG pixels per cell
cell_per_block = 2         # HOG cells per block
hog_channel = "ALL"        # Can be 0, 1, 2, or "ALL"
spatial_size = (16, 16)    # Spatial binning dimensions
hist_bins = 16             # Number of histogram bins
spatial_feat = False       # Spatial features on or off
hist_feat = False          # Histogram features on or off
hog_feat = True            # HOG features on or off
y_start_stop = [400, 800]  # Min and max in y to search in slide_window()
car_features = extract_features(cars, color_space=color_space,
                                spatial_size=spatial_size, hist_bins=hist_bins,
                                orient=orient, pix_per_cell=pix_per_cell,
                                cell_per_block=cell_per_block,
                                hog_channel=hog_channel, spatial_feat=spatial_feat,
                                hist_feat=hist_feat, hog_feat=hog_feat)
notcar_features = extract_features(notcars, color_space=color_space,
                                   spatial_size=spatial_size, hist_bins=hist_bins,
                                   orient=orient, pix_per_cell=pix_per_cell,
                                   cell_per_block=cell_per_block,
                                   hog_channel=hog_channel, spatial_feat=spatial_feat,
                                   hist_feat=hist_feat, hog_feat=hog_feat)
X = np.vstack((car_features, notcar_features)).astype(np.float64)
# Fit a per-column scaler
X_scaler = StandardScaler().fit(X)
# Apply the scaler to X
scaled_X = X_scaler.transform(X)
# Define the labels vector
y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))
# Split up data into randomized training and test sets
rand_state = np.random.randint(0, 100)
X_train, X_test, y_train, y_test = train_test_split(
    scaled_X, y, test_size=0.2, random_state=rand_state)
print('Using:', orient, 'orientations', pix_per_cell,
      'pixels per cell and', cell_per_block, 'cells per block')
print('Feature vector length:', len(X_train[0]))
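Because HOG, spatial-binning, and color-histogram features live on very different numeric scales, the per-column scaling above matters for the linear SVM. A minimal numpy-only sketch of what `StandardScaler` computes, on made-up data:

```python
import numpy as np

# Two feature columns on very different scales (invented values)
X = np.array([[1.0, 100.0],
              [3.0, 300.0],
              [5.0, 500.0]])

# StandardScaler fits a per-column mean and std, then applies (x - mean) / std
mean = X.mean(axis=0)
std = X.std(axis=0)
scaled = (X - mean) / std

print(scaled.mean(axis=0))  # each column now has zero mean
print(scaled.std(axis=0))   # and unit variance
```

Without this step, the higher-magnitude feature group would dominate the SVM's decision function.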
# Use a linear SVC
svc = LinearSVC()
# Check the training time for the SVC
t=time.time()
svc.fit(X_train, y_train)
t2 = time.time()
print(round(t2-t, 2), 'Seconds to train SVC...')
# Check the score of the SVC
print('Test Accuracy of SVC = ', round(svc.score(X_test, y_test), 4))
image = mpimg.imread('test_images/test1.jpg')
draw_image = np.copy(image)
# The training data was extracted from .png images (scaled 0 to 1 by mpimg),
# while the image being searched is a .jpg (scaled 0 to 255), so rescale to match
image = image.astype(np.float32)/255
# 96x96 windows with 50% overlap; (64, 64) and (32, 32) windows were also tried
windows = slide_window(image, x_start_stop=[None, None], y_start_stop=y_start_stop,
                       xy_window=(96, 96), xy_overlap=(0.5, 0.5))
hot_windows = search_windows(image, windows, svc, X_scaler, color_space=color_space,
                             spatial_size=spatial_size, hist_bins=hist_bins,
                             orient=orient, pix_per_cell=pix_per_cell,
                             cell_per_block=cell_per_block,
                             hog_channel=hog_channel, spatial_feat=spatial_feat,
                             hist_feat=hist_feat, hog_feat=hog_feat)
window_img = draw_boxes(draw_image, hot_windows, color=(0, 0, 255), thick=6)
fig = plt.figure(figsize=(30, 30))
plt.imshow(window_img)
plt.show()
# Search and Classify
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import cv2
import glob
import time
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from skimage.feature import hog
import pickle
#from lesson_functions import *
# NOTE: the next import is only valid for scikit-learn version <= 0.17
# for scikit-learn >= 0.18 use:
# from sklearn.model_selection import train_test_split
from sklearn.cross_validation import train_test_split
bbox_pickle = {}
# Read in cars and notcars
images = glob.glob('training_dataset/**/**/*.png')
cars = []
notcars = []
for image in images:
    if 'non-vehicles' in image:
        notcars.append(image)
    elif 'vehicles' in image:
        cars.append(image)
    else:
        cars.append(image)
print("number of cars,notcars=", len(cars), len(notcars))
# Cap the sample size to keep feature extraction and training times manageable
sample_size = 8500
cars = cars[0:sample_size]
notcars = notcars[0:sample_size]
### Final parameters for the pipeline
color_space = 'YCrCb'      # Also tried RGB, HSV, LUV, HLS, YUV
orient = 24                # HOG orientations (also tried 5, 6, 8, 9, 12, 16, 32, 64)
pix_per_cell = 8           # HOG pixels per cell
cell_per_block = 2         # HOG cells per block (also tried 1, 3, 4)
hog_channel = "ALL"        # Can be 0, 1, 2, or "ALL"
spatial_size = (16, 16)    # Spatial binning dimensions
hist_bins = 32             # Number of histogram bins (also tried 16, 24, 64, 96)
spatial_feat = False       # Spatial features on or off
hist_feat = True           # Histogram features on or off
hog_feat = True            # HOG features on or off
y_start_stop = [400, 800]  # Min and max in y to search (also tried [400, 720], [500, 720], [300, 800], [200, 800], [450, 800], [500, 800])
car_features = extract_features(cars, color_space=color_space,
                                spatial_size=spatial_size, hist_bins=hist_bins,
                                orient=orient, pix_per_cell=pix_per_cell,
                                cell_per_block=cell_per_block,
                                hog_channel=hog_channel, spatial_feat=spatial_feat,
                                hist_feat=hist_feat, hog_feat=hog_feat)
notcar_features = extract_features(notcars, color_space=color_space,
                                   spatial_size=spatial_size, hist_bins=hist_bins,
                                   orient=orient, pix_per_cell=pix_per_cell,
                                   cell_per_block=cell_per_block,
                                   hog_channel=hog_channel, spatial_feat=spatial_feat,
                                   hist_feat=hist_feat, hog_feat=hog_feat)
X = np.vstack((car_features, notcar_features)).astype(np.float64)
# Fit a per-column scaler
X_scaler = StandardScaler().fit(X)
# Apply the scaler to X
scaled_X = X_scaler.transform(X)
# Define the labels vector
y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))
# Split up data into randomized training and test sets
rand_state = np.random.randint(0, 100)
X_train, X_test, y_train, y_test = train_test_split(
    scaled_X, y, test_size=0.2, random_state=rand_state)
print('Using:', orient, 'orientations', pix_per_cell,
      'pixels per cell and', cell_per_block, 'cells per block')
print('Feature vector length:', len(X_train[0]))
# Use a linear SVC
svc = LinearSVC()
# Check the training time for the SVC
t=time.time()
svc.fit(X_train, y_train)
t2 = time.time()
print(round(t2-t, 2), 'Seconds to train SVC...')
# Check the score of the SVC
print('Test Accuracy of SVC = ', round(svc.score(X_test, y_test), 4))
image = mpimg.imread('test_images/test1.jpg')
draw_image = np.copy(image)
# The training data was extracted from .png images (scaled 0 to 1 by mpimg),
# while the image being searched is a .jpg (scaled 0 to 255), so rescale to match
image = image.astype(np.float32)/255
# 96x96 windows with 50% overlap; (128, 128), (64, 64) and (32, 32) were also tried
windows = slide_window(image, x_start_stop=[None, None], y_start_stop=y_start_stop,
                       xy_window=(96, 96), xy_overlap=(0.5, 0.5))
hot_windows = search_windows(image, windows, svc, X_scaler, color_space=color_space,
                             spatial_size=spatial_size, hist_bins=hist_bins,
                             orient=orient, pix_per_cell=pix_per_cell,
                             cell_per_block=cell_per_block,
                             hog_channel=hog_channel, spatial_feat=spatial_feat,
                             hist_feat=hist_feat, hog_feat=hog_feat)
# Save the bounding box results for later use in the heatmap step
bbox_pickle = hot_windows
pickle.dump(bbox_pickle, open("bbox_pickle.p", "wb"))
window_img = draw_boxes(draw_image, hot_windows, color=(0, 0, 255), thick=6)
print("\nParameters:")
print("color_space: ", color_space)
print("orient: ", orient)
print("pix_per_cell: ", pix_per_cell)
print("cell_per_block: ", cell_per_block)
print("hog_channel: ", hog_channel)
print("spatial_size: ", spatial_size)
print("hist_bins: ", hist_bins)
print("spatial_feat: ", spatial_feat)
print("hist_feat: ", hist_feat)
print("hog_feat: ", hog_feat)
fig = plt.figure(figsize=(30, 30))
plt.imshow(window_img)
plt.show()
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import pickle
import cv2
from scipy.ndimage import label  # scipy.ndimage.measurements is deprecated in newer SciPy
# Read in a pickle file with bboxes saved
# Each item in the "all_bboxes" list will contain a
# list of boxes for one of the images shown above
box_list = pickle.load( open( "bbox_pickle.p", "rb" ))
# Read in image similar to one shown above
image = mpimg.imread('test_images/test1.jpg')
#image = mpimg.imread('examples/bbox-example-image.jpg')
#image = mpimg.imread('test_image.jpg')
heat = np.zeros_like(image[:,:,0]).astype(float)  # np.float was removed in newer NumPy
def add_heat(heatmap, bbox_list):
    # Iterate through list of bboxes
    for box in bbox_list:
        # Add += 1 for all pixels inside each bbox
        # Assuming each "box" takes the form ((x1, y1), (x2, y2))
        heatmap[box[0][1]:box[1][1], box[0][0]:box[1][0]] += 1
    # Return updated heatmap
    return heatmap
def apply_threshold(heatmap, threshold):
    # Zero out pixels below the threshold
    heatmap[heatmap <= threshold] = 0
    # Return thresholded map
    return heatmap
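Together, `add_heat()` and `apply_threshold()` keep only pixels covered by enough overlapping detections. A miniature sketch on a synthetic 10x10 heatmap with two invented, overlapping boxes (the coordinates and threshold here are illustrative, not the pipeline's values):

```python
import numpy as np

# Tiny "image" with two overlapping detections, in ((x1, y1), (x2, y2)) form
heat = np.zeros((10, 10))
boxes = [((1, 1), (5, 5)), ((3, 3), (7, 7))]

for (x1, y1), (x2, y2) in boxes:
    heat[y1:y2, x1:x2] += 1   # same row/column indexing as add_heat()

heat[heat <= 1] = 0           # same rule as apply_threshold(heat, 1)

print(heat[4, 4])  # 2.0 -> the overlap region survives the threshold
print(heat[2, 2])  # 0.0 -> a single-detection region is rejected
```

This is why overlapping sliding windows help: a true vehicle accumulates heat from several window sizes and positions, while a one-off false positive does not.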
def draw_labeled_bboxes(img, labels):
    # Iterate through all detected cars
    for car_number in range(1, labels[1]+1):
        # Find pixels with each car_number label value
        nonzero = (labels[0] == car_number).nonzero()
        # Identify x and y values of those pixels
        nonzeroy = np.array(nonzero[0])
        nonzerox = np.array(nonzero[1])
        # Define a bounding box based on min/max x and y
        bbox = ((np.min(nonzerox), np.min(nonzeroy)), (np.max(nonzerox), np.max(nonzeroy)))
        # Draw the box on the image
        cv2.rectangle(img, bbox[0], bbox[1], (0,0,255), 6)
    # Return the image
    return img
# Add heat to each box in box list
heat = add_heat(heat, box_list)
# Apply threshold to help remove false positives
# (thresholds of .75, .99 and 1 were also tried)
heat = apply_threshold(heat, .5)
# Visualize the heatmap when displaying
heatmap = np.clip(heat, 0, 255)
# Find final boxes from heatmap using label function
labels = label(heatmap)
draw_img1 = draw_labeled_bboxes(np.copy(image), labels)
fig = plt.figure(figsize=(30, 30))
plt.subplot(121)
plt.imshow(draw_img1)
plt.title('Car Positions')
plt.subplot(122)
plt.imshow(heatmap, cmap='hot')
plt.title('Heat Map')
fig.tight_layout()
plt.show()
def get_hog_features(img, orient, pix_per_cell, cell_per_block,
                     vis=False, feature_vec=True):
    # Call with two outputs if vis==True
    if vis == True:
        features, hog_image = hog(img, orientations=orient,
                                  pixels_per_cell=(pix_per_cell, pix_per_cell),
                                  cells_per_block=(cell_per_block, cell_per_block),
                                  transform_sqrt=False,
                                  visualise=vis, feature_vector=feature_vec)
        return features, hog_image
    # Otherwise call with one output
    else:
        features = hog(img, orientations=orient,
                       pixels_per_cell=(pix_per_cell, pix_per_cell),
                       cells_per_block=(cell_per_block, cell_per_block),
                       transform_sqrt=False,
                       visualise=vis, feature_vector=feature_vec)
        return features
car_img = mpimg.imread(car_images[5])
_, car_dst = get_hog_features(car_img[:,:,2], 9, 8, 8, vis=True, feature_vec=True)
noncar_img = mpimg.imread(noncar_images[5])
_, noncar_dst = get_hog_features(noncar_img[:,:,2], 9, 8, 8, vis=True, feature_vec=True)
# Visualize
f, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(7,7))
f.subplots_adjust(hspace = .4, wspace=.2)
ax1.imshow(car_img)
ax1.set_title('Car Image', fontsize=16)
ax2.imshow(car_dst, cmap='gray')
ax2.set_title('Car HOG', fontsize=16)
ax3.imshow(noncar_img)
ax3.set_title('Non-Car Image', fontsize=16)
ax4.imshow(noncar_dst, cmap='gray')
ax4.set_title('Non-Car HOG', fontsize=16)
plt.show()
# Define a function to extract features from a list of image locations
# This function could also be used to call bin_spatial() and color_hist() (as in the lessons) to extract
# flattened spatial color features and color histogram features and combine them all (making use of StandardScaler)
# to be used together for classification
def extract_features(imgs, cspace='RGB', orient=9,
                     pix_per_cell=8, cell_per_block=2, hog_channel=0):
    # Create a list to append feature vectors to
    features = []
    # Iterate through the list of images
    for file in imgs:
        # Read in each one by one
        image = mpimg.imread(file)
        # apply color conversion if other than 'RGB'
        if cspace != 'RGB':
            if cspace == 'HSV':
                feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
            elif cspace == 'LUV':
                feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2LUV)
            elif cspace == 'HLS':
                feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HLS)
            elif cspace == 'YUV':
                feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2YUV)
            elif cspace == 'YCrCb':
                feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2YCrCb)
        else: feature_image = np.copy(image)
        # Call get_hog_features() with vis=False, feature_vec=True
        if hog_channel == 'ALL':
            hog_features = []
            for channel in range(feature_image.shape[2]):
                hog_features.append(get_hog_features(feature_image[:,:,channel],
                                    orient, pix_per_cell, cell_per_block,
                                    vis=False, feature_vec=True))
            hog_features = np.ravel(hog_features)
        else:
            hog_features = get_hog_features(feature_image[:,:,hog_channel], orient,
                                            pix_per_cell, cell_per_block,
                                            vis=False, feature_vec=True)
        # Append the new feature vector to the features list
        features.append(hog_features)
    # Return list of feature vectors
    return features
# Feature extraction parameters
colorspace = 'YUV' # Can be RGB, HSV, LUV, HLS, YUV, YCrCb
orient = 11
pix_per_cell = 16
cell_per_block = 2
hog_channel = 'ALL' # Can be 0, 1, 2, or "ALL"
t = time.time()
car_features = extract_features(car_images, cspace=colorspace, orient=orient,
                                pix_per_cell=pix_per_cell, cell_per_block=cell_per_block,
                                hog_channel=hog_channel)
notcar_features = extract_features(noncar_images, cspace=colorspace, orient=orient,
                                   pix_per_cell=pix_per_cell, cell_per_block=cell_per_block,
                                   hog_channel=hog_channel)
t2 = time.time()
print(round(t2-t, 2), 'Seconds to extract HOG features...')
# Create an array stack of feature vectors
X = np.vstack((car_features, notcar_features)).astype(np.float64)
# Fit a per-column scaler - this will be necessary if combining different types of features (HOG + color_hist/bin_spatial)
#X_scaler = StandardScaler().fit(X)
# Apply the scaler to X
#scaled_X = X_scaler.transform(X)
# Define the labels vector
y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))
# Split up data into randomized training and test sets
rand_state = np.random.randint(0, 100)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=rand_state)
print('Using:', orient, 'orientations', pix_per_cell,
      'pixels per cell and', cell_per_block, 'cells per block')
print('Feature vector length:', len(X_train[0]))
# Use a linear SVC
svc = LinearSVC()
# Check the training time for the SVC
t = time.time()
svc.fit(X_train, y_train)
t2 = time.time()
print(round(t2-t, 2), 'Seconds to train SVC...')
# Check the score of the SVC
print('Test Accuracy of SVC = ', round(svc.score(X_test, y_test), 4))
# Check the prediction time for a single sample
t=time.time()
n_predict = 10
print('My SVC predicts: ', svc.predict(X_test[0:n_predict]))
print('For these',n_predict, 'labels: ', y_test[0:n_predict])
t2 = time.time()
print(round(t2-t, 5), 'Seconds to predict', n_predict,'labels with SVC')
# Define a single function that can extract features using hog sub-sampling and make predictions
def detect_vehicles3(img, svc, X_scaler, ystart, ystop, scale, colorspace, hog_channel,
                     orient, pix_per_cell, cell_per_block, spatial_size, hist_bins,
                     show_all_rectangles=False):
    # array of rectangles where cars were detected
    rectangles = []
    img = img.astype(np.float32) / 255
    img_tosearch = img[ystart:ystop, :, :]
    # apply color conversion if other than 'RGB'
    if colorspace != 'RGB':
        if colorspace == 'HSV':
            ctrans_tosearch = cv2.cvtColor(img_tosearch, cv2.COLOR_RGB2HSV)
        elif colorspace == 'LUV':
            ctrans_tosearch = cv2.cvtColor(img_tosearch, cv2.COLOR_RGB2LUV)
        elif colorspace == 'HLS':
            ctrans_tosearch = cv2.cvtColor(img_tosearch, cv2.COLOR_RGB2HLS)
        elif colorspace == 'YUV':
            ctrans_tosearch = cv2.cvtColor(img_tosearch, cv2.COLOR_RGB2YUV)
        elif colorspace == 'YCrCb':
            ctrans_tosearch = cv2.cvtColor(img_tosearch, cv2.COLOR_RGB2YCrCb)
    else:
        ctrans_tosearch = np.copy(img_tosearch)
    # rescale image if other than 1.0 scale
    if scale != 1:
        imshape = ctrans_tosearch.shape
        ctrans_tosearch = cv2.resize(ctrans_tosearch,
                                     (int(imshape[1] / scale), int(imshape[0] / scale)))
    # select colorspace channel for HOG
    if hog_channel == 'ALL':
        ch1 = ctrans_tosearch[:, :, 0]
        ch2 = ctrans_tosearch[:, :, 1]
        ch3 = ctrans_tosearch[:, :, 2]
    else:
        ch1 = ctrans_tosearch[:, :, hog_channel]
    # Define blocks and steps as above
    nxblocks = (ch1.shape[1] // pix_per_cell) + 1  # -1
    nyblocks = (ch1.shape[0] // pix_per_cell) + 1  # -1
    nfeat_per_block = orient * cell_per_block ** 2
    # 64 was the original sampling rate, with 8 cells and 8 pix per cell
    window = 64
    nblocks_per_window = (window // pix_per_cell) - 1
    cells_per_step = 2  # Instead of overlap, define how many cells to step
    nxsteps = (nxblocks - nblocks_per_window) // cells_per_step
    nysteps = (nyblocks - nblocks_per_window) // cells_per_step
    # Compute individual channel HOG features for the entire image
    hog1 = get_hog_features(ch1, orient, pix_per_cell, cell_per_block, feature_vec=False)
    if hog_channel == 'ALL':
        hog2 = get_hog_features(ch2, orient, pix_per_cell, cell_per_block, feature_vec=False)
        hog3 = get_hog_features(ch3, orient, pix_per_cell, cell_per_block, feature_vec=False)
    for xb in range(nxsteps):
        for yb in range(nysteps):
            ypos = yb * cells_per_step
            xpos = xb * cells_per_step
            # Extract HOG for this patch
            hog_feat1 = hog1[ypos:ypos + nblocks_per_window, xpos:xpos + nblocks_per_window].ravel()
            if hog_channel == 'ALL':
                hog_feat2 = hog2[ypos:ypos + nblocks_per_window, xpos:xpos + nblocks_per_window].ravel()
                hog_feat3 = hog3[ypos:ypos + nblocks_per_window, xpos:xpos + nblocks_per_window].ravel()
                hog_features = np.hstack((hog_feat1, hog_feat2, hog_feat3))
            else:
                hog_features = hog_feat1
            xleft = xpos * pix_per_cell
            ytop = ypos * pix_per_cell
            # predict expects a 2D array: one row per sample
            test_prediction = svc.predict(hog_features.reshape(1, -1))
            if test_prediction == 1 or show_all_rectangles:
                xbox_left = int(xleft * scale)
                ytop_draw = int(ytop * scale)
                win_draw = int(window * scale)
                rectangles.append(((xbox_left, ytop_draw + ystart),
                                   (xbox_left + win_draw, ytop_draw + win_draw + ystart)))
    return rectangles
Answer: I used a sliding-window search to find cars and draw bounding boxes around the cars found.
The sliding window acts essentially like a scan: as a window slides across the image,
the overlapping tiles in each test image are classified as vehicle or non-vehicle.
One key aspect of the sliding window is scale with respect to perspective: the farther away an object is, the smaller it appears, so the search scale should decrease; the closer it is, the larger the scale should be. Another factor is the amount of overlap between windows; I chose an overlap of 0.5.
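The window-generation step can be sketched as follows. This is a simplified, hypothetical version of a slide_window helper (the exact signature in my notebook may differ); it converts the 0.5 overlap into a pixel step and tiles the search region:

```python
def slide_window(img_shape, x_start_stop=(None, None), y_start_stop=(None, None),
                 xy_window=(64, 64), xy_overlap=(0.5, 0.5)):
    # Default to the full image extent when start/stop are not given
    x_start = x_start_stop[0] if x_start_stop[0] is not None else 0
    x_stop = x_start_stop[1] if x_start_stop[1] is not None else img_shape[1]
    y_start = y_start_stop[0] if y_start_stop[0] is not None else 0
    y_stop = y_start_stop[1] if y_start_stop[1] is not None else img_shape[0]
    # An overlap of 0.5 on a 64-pixel window means a 32-pixel step
    nx_step = int(xy_window[0] * (1 - xy_overlap[0]))
    ny_step = int(xy_window[1] * (1 - xy_overlap[1]))
    windows = []
    for y in range(y_start, y_stop - xy_window[1] + 1, ny_step):
        for x in range(x_start, x_stop - xy_window[0] + 1, nx_step):
            windows.append(((x, y), (x + xy_window[0], y + xy_window[1])))
    return windows
```

With a 1280x720 frame and the search restricted to y in [400, 656], this yields 64x64 windows stepping 32 pixels in each direction.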
For the sliding window, I used the following functions: slide_window, search_windows, and draw_labeled_bboxes.
Since I am looking for vehicles, I restrict the search using y_start and y_stop to limit the search area in the image only to where the vehicles might appear. This has the same effect as cropping out the sky and speeds up the search.
Here is an example search with different window positions at different scales all over the image.

Answer:
I improved the reliability of the classifier in the following way:
to reduce the number of false positives, I applied a heat map and a threshold. This resulted in fewer false positives and more reliable car detections.
Here are six consecutive frames from the project video, showing all the bounding boxes where my classifier reported positive detections. Overlapping detections exist for each of the two vehicles, and in two of the frames there is a false-positive detection on the guardrail to the left. I built a heat map from these detections in order to combine overlapping detections and remove false positives.

To make the heat map, I add "heat" (+= 1) for all pixels within windows where the classifier reports a positive detection. The individual heat maps for the images above look like this:

test_img = mpimg.imread('./test_images/test1.jpg')
ystart = 400
ystop = 656
scale = 1.5
colorspace = 'YUV'  # Can be RGB, HSV, LUV, HLS, YUV, YCrCb
orient = 11
pix_per_cell = 16
cell_per_block = 2
hog_channel = 'ALL'  # Can be 0, 1, 2, or "ALL"
rectangles = detect_vehicles3(test_img, svc, None, ystart=ystart, ystop=ystop, scale=scale,
                              colorspace=colorspace, hog_channel=hog_channel, orient=orient,
                              pix_per_cell=pix_per_cell, cell_per_block=cell_per_block,
                              spatial_size=None, hist_bins=None, show_all_rectangles=False)
# Here is your draw_boxes function from the previous exercise
def draw_boxes(img, bboxes, color=(0, 0, 255), thick=6):
    # Make a copy of the image
    imcopy = np.copy(img)
    random_color = False
    # Iterate through the bounding boxes
    for bbox in bboxes:
        if color == 'random' or random_color:
            color = (np.random.randint(0, 255), np.random.randint(0, 255), np.random.randint(0, 255))
            random_color = True
        # Draw a rectangle given bbox coordinates
        cv2.rectangle(imcopy, bbox[0], bbox[1], color, thick)
    # Return the image copy with boxes drawn
    return imcopy
test_img_rects = draw_boxes(test_img, rectangles)
plt.figure(figsize=(10,10))
plt.imshow(test_img_rects)
plt.show()
test_img = mpimg.imread('./test_images/test1.jpg')
rects = []
rects.append(detect_vehicles3(test_img, svc, None, ystart=400, ystop=464, scale=1.0,
                              colorspace='YUV', hog_channel='ALL', orient=11, pix_per_cell=16,
                              cell_per_block=2, spatial_size=None, hist_bins=None,
                              show_all_rectangles=True))
rects.append(detect_vehicles3(test_img, svc, None, ystart=416, ystop=480, scale=1.0,
                              colorspace='YUV', hog_channel='ALL', orient=11, pix_per_cell=16,
                              cell_per_block=2, spatial_size=None, hist_bins=None,
                              show_all_rectangles=True))
rectangles = [item for sublist in rects for item in sublist]
test_img_rects = draw_boxes(test_img, rectangles, color=(0, 0, 255), thick=2)
plt.figure(figsize=(10, 10))
plt.imshow(test_img_rects)
test_img = mpimg.imread('./test_images/test1.jpg')
rects = []
rects.append(detect_vehicles3(test_img, svc, None, ystart=400, ystop=496, scale=1.5,
                              colorspace='YUV', hog_channel='ALL', orient=11, pix_per_cell=16,
                              cell_per_block=2, spatial_size=None, hist_bins=None,
                              show_all_rectangles=True))
rects.append(detect_vehicles3(test_img, svc, None, ystart=432, ystop=528, scale=1.5,
                              colorspace='YUV', hog_channel='ALL', orient=11, pix_per_cell=16,
                              cell_per_block=2, spatial_size=None, hist_bins=None,
                              show_all_rectangles=True))
rectangles = [item for sublist in rects for item in sublist]
test_img_rects = draw_boxes(test_img, rectangles, color=(0, 0, 255), thick=2)
plt.figure(figsize=(10, 10))
plt.imshow(test_img_rects)
test_img = mpimg.imread('./test_images/test1.jpg')
rects = []
rects.append(detect_vehicles3(test_img, svc, None, ystart=400, ystop=528, scale=2.0,
                              colorspace='YUV', hog_channel='ALL', orient=11, pix_per_cell=16,
                              cell_per_block=2, spatial_size=None, hist_bins=None,
                              show_all_rectangles=True))
rects.append(detect_vehicles3(test_img, svc, None, ystart=432, ystop=560, scale=2.0,
                              colorspace='YUV', hog_channel='ALL', orient=11, pix_per_cell=16,
                              cell_per_block=2, spatial_size=None, hist_bins=None,
                              show_all_rectangles=True))
rects = [item for sublist in rects for item in sublist]
test_img_rects = draw_boxes(test_img, rects, color=(0, 0, 255), thick=2)
plt.figure(figsize=(10, 10))
plt.imshow(test_img_rects)
test_img = mpimg.imread('./test_images/test1.jpg')
rects = []
rects.append(detect_vehicles3(test_img, svc, None, ystart=400, ystop=596, scale=3.0,
                              colorspace='YUV', hog_channel='ALL', orient=11, pix_per_cell=16,
                              cell_per_block=2, spatial_size=None, hist_bins=None,
                              show_all_rectangles=True))
rects.append(detect_vehicles3(test_img, svc, None, ystart=464, ystop=660, scale=3.0,
                              colorspace='YUV', hog_channel='ALL', orient=11, pix_per_cell=16,
                              cell_per_block=2, spatial_size=None, hist_bins=None,
                              show_all_rectangles=True))
rectangles = [item for sublist in rects for item in sublist]
test_img_rects = draw_boxes(test_img, rectangles, color=(0, 0, 255), thick=2)
plt.figure(figsize=(10, 10))
plt.imshow(test_img_rects)
test_img = mpimg.imread('./test_images/test1.jpg')
# (ystart, ystop, scale) for each search band
search_params = [(400, 464, 1.0), (416, 480, 1.0),
                 (400, 496, 1.5), (432, 528, 1.5),
                 (400, 528, 2.0), (432, 560, 2.0),
                 (400, 596, 3.5), (464, 660, 3.5)]
rectangles = []
for ystart, ystop, scale in search_params:
    rectangles.append(detect_vehicles3(test_img, svc, None, ystart=ystart, ystop=ystop,
                                       scale=scale, colorspace='YUV', hog_channel='ALL',
                                       orient=11, pix_per_cell=16, cell_per_block=2,
                                       spatial_size=None, hist_bins=None,
                                       show_all_rectangles=None))
# flatten the list of lists
rectangles = [item for sublist in rectangles for item in sublist]
test_img_rects = draw_boxes(test_img, rectangles, color=(0, 0, 255), thick=2)
plt.figure(figsize=(10, 10))
plt.imshow(test_img_rects)
def add_heat(heatmap, bbox_list):
    # Iterate through list of bboxes
    for box in bbox_list:
        # Add += 1 for all pixels inside each bbox
        # Assuming each "box" takes the form ((x1, y1), (x2, y2))
        heatmap[box[0][1]:box[1][1], box[0][0]:box[1][0]] += 1
    # Return updated heatmap
    return heatmap

# Test out the heatmap
heatmap_img = np.zeros_like(test_img[:, :, 0])
heatmap_img = add_heat(heatmap_img, rectangles)
plt.figure(figsize=(10, 10))
plt.imshow(heatmap_img, cmap='hot')
def apply_threshold(heatmap, threshold):
    # Zero out pixels below the threshold
    heatmap[heatmap <= threshold] = 0
    # Return thresholded map
    return heatmap

heatmap_img = apply_threshold(heatmap_img, 1)
plt.figure(figsize=(10, 10))
plt.imshow(heatmap_img, cmap='hot')

labels = label(heatmap_img)
plt.figure(figsize=(10, 10))
plt.imshow(labels[0], cmap='gray')
print(labels[1], 'cars found')
def draw_labeled_bboxes(img, labels):
    # Iterate through all detected cars
    rects = []
    for car_number in range(1, labels[1] + 1):
        # Find pixels with each car_number label value
        nonzero = (labels[0] == car_number).nonzero()
        # Identify x and y values of those pixels
        nonzeroy = np.array(nonzero[0])
        nonzerox = np.array(nonzero[1])
        # Define a bounding box based on min/max x and y
        bbox = ((np.min(nonzerox), np.min(nonzeroy)), (np.max(nonzerox), np.max(nonzeroy)))
        rects.append(bbox)
        # Draw the box on the image
        cv2.rectangle(img, bbox[0], bbox[1], (0, 0, 255), 6)
    # Return the image and final rectangles
    return img, rects

# Draw bounding boxes on a copy of the image
draw_img, rect = draw_labeled_bboxes(np.copy(test_img), labels)
# Display the image
plt.figure(figsize=(10, 10))
plt.imshow(draw_img)
Answer:
I used a sliding-window search along with an SVC classifier to identify vehicles in an image, and an image-processing pipeline to process the project video and draw bounding boxes around the vehicles detected.
After experimenting with a number of different parameters, I found the following worked well:
I searched on multiple scales using YUV-colorspace HOG features (spatial binning and color histograms were ultimately left out of the final feature vector, as spatial_size and hist_bins are passed as None). Here are some example images:

Answer:
Multiple Detections and False Positives
To filter out multiple detections and false positives, I built heat maps from detections found at the same position in subsequent frames. Heat maps showing the location of repeat detections reduced the number of false positives. To merge multiple detections, I thresholded the heat map and drew a single bounding box around each area with multiple overlapping detections.
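As a sketch of this idea (the class and method names here are illustrative, not the actual code in my notebook), detections can be accumulated over the last few frames before thresholding:

```python
from collections import deque
import numpy as np

class HeatHistory:
    """Accumulate detection rectangles over the last n frames so that only
    windows detected repeatedly across frames survive the threshold."""
    def __init__(self, n_frames=6):
        self.frames = deque(maxlen=n_frames)

    def add_rects(self, rects):
        self.frames.append(rects)

    def thresholded_heatmap(self, shape, threshold):
        heat = np.zeros(shape, dtype=np.float32)
        for rects in self.frames:
            for (x1, y1), (x2, y2) in rects:
                heat[y1:y2, x1:x2] += 1
        heat[heat <= threshold] = 0
        return heat
```

A box that appears in several consecutive frames builds up heat and survives; a one-frame false positive does not.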
Here's a link to my video result
Answer:
I recorded the positions of positive detections in each frame of the video. From the positive detections I created a heatmap and then thresholded that map to identify vehicle positions. I then used scipy.ndimage.measurements.label() to identify individual blobs in the heatmap, assumed each blob corresponded to a vehicle, and constructed a bounding box to cover the area of each blob detected.
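A tiny worked example of the label() step (toy numbers, not taken from the project data):

```python
import numpy as np
from scipy.ndimage import label  # scipy.ndimage.measurements.label in older SciPy

# Toy 8x8 "heatmap" with two separate hot regions
heat = np.zeros((8, 8), dtype=int)
heat[1:3, 1:4] = 2   # blob 1
heat[5:7, 5:8] = 3   # blob 2

labels, n_blobs = label(heat)
boxes = []
for car_number in range(1, n_blobs + 1):
    ys, xs = (labels == car_number).nonzero()
    boxes.append(((xs.min(), ys.min()), (xs.max(), ys.max())))

print(n_blobs, 'cars found')  # 2 cars found
```

Each connected blob gets its own integer label, and the min/max of its pixel coordinates give the bounding box, exactly as in draw_labeled_bboxes.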
Here's an example result showing the heatmap from a series of frames of video, the result of scipy.ndimage.measurements.label() and the bounding boxes then overlaid on the last frame of video:

scipy.ndimage.measurements.label() on the integrated heatmap from all six frames:

def process_frame(img):
    rectangles = []
    # (ystart, ystop, scale) for each search band
    search_params = [(400, 464, 1.0), (416, 480, 1.0),
                     (400, 496, 1.5), (432, 528, 1.5),
                     (400, 528, 2.0), (400, 560, 2.0),
                     (400, 596, 3.5), (464, 660, 3.5)]
    for ystart, ystop, scale in search_params:
        rectangles.append(detect_vehicles3(img, svc, None, ystart=ystart, ystop=ystop,
                                           scale=scale, colorspace='YUV', hog_channel='ALL',
                                           orient=11, pix_per_cell=16, cell_per_block=2,
                                           spatial_size=None, hist_bins=None,
                                           show_all_rectangles=None))
    # flatten the list of lists
    rectangles = [item for sublist in rectangles for item in sublist]
    heatmap_img = np.zeros_like(img[:, :, 0])
    heatmap_img = add_heat(heatmap_img, rectangles)
    heatmap_img = apply_threshold(heatmap_img, 1)
    labels = label(heatmap_img)
    draw_img, rects = draw_labeled_bboxes(np.copy(img), labels)
    return draw_img
test_images = glob.glob('./test_images/test*.jpg')
fig, axs = plt.subplots(3, 2, figsize=(16, 14))
fig.subplots_adjust(hspace=.004, wspace=.002)
axs = axs.ravel()
for i, im in enumerate(test_images):
    axs[i].imshow(process_frame(mpimg.imread(im)))
    axs[i].axis('off')
from moviepy.editor import VideoFileClip
from IPython.display import HTML
test_out_file = 'project_video_out.mp4'
clip_test = VideoFileClip('project_video.mp4')
clip_test_out = clip_test.fl_image(process_frame)
%time clip_test_out.write_videofile(test_out_file, audio=False)
HTML("""
<video width="960" height="540" controls>
<source src="{0}">
</video>
""".format(test_out_file))
Briefly discuss any problems / issues you faced in your implementation of this project. Where will your pipeline likely fail? What could you do to make it more robust?
Answer:
Here are some of the problems I ran into while working on the Vehicle Detection project. With certain parameters such as color space, the SVC classifier would detect dark-colored cars but not white cars. The classifier also produced false positives, detecting dark patches from trees as cars and drawing bounding boxes around them. I knew I could restrict the search space to exclude the sky and the sides of the road to eliminate false positives; however, I wanted to experiment with parameters such as color space so that the classifier could learn to eliminate these mistakes on its own.
Another issue I ran into was that the bounding boxes sometimes did not enclose the vehicle entirely. For other cars, the box did enclose the vehicle but was too loose and needed to be tighter.
I experimented with a number of HOG parameters, including color space, orientations, ystart and ystop, pix_per_cell, cell_per_block and the number of HOG channels, as well as sliding-window parameters, including scale, constrained search areas and overlapping regions.
To improve my pipeline, I would go back and add historical statistical "memory". This would make the vehicle detection more robust: if a detection is lost in one frame, historical positions could be used to keep track of vehicles on the road.
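A minimal sketch of that "memory" idea (class and method names are hypothetical): keep the rectangles from the last n frames and let the heatmap threshold grow with the amount of history, so one dropped frame does not lose a tracked vehicle:

```python
from collections import deque

class VehicleDetectHistory:
    """Remember detection rectangles from recent frames."""
    def __init__(self, n_frames=15):
        self.prev_rects = deque(maxlen=n_frames)

    def add_rects(self, rects):
        # Only remember frames that actually had detections
        if len(rects) > 0:
            self.prev_rects.append(rects)

    def all_rects(self):
        # Rectangles from all remembered frames, flattened
        return [r for frame in self.prev_rects for r in frame]

    def threshold(self):
        # Require more overlapping heat as more history accumulates
        return 1 + len(self.prev_rects) // 2
```

In process_frame, the heatmap would then be built from all_rects() and thresholded with threshold(), instead of from the current frame alone.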